Abstract

This paper considers the problem of adaptive estimation of a template in a randomly shifted curve model. Using the Fourier transform of the data, we show that this problem can be transformed into a stochastic linear inverse problem. Our aim is to approach the estimator that has the smallest risk on the true template over a finite set of linear estimators defined in the Fourier domain. Based on the principle of unbiased empirical risk minimization, we derive a nonasymptotic oracle inequality in the case where the law of the random shifts is known. This inequality can then be used to obtain adaptive results on Sobolev spaces as the number of observed curves tends to infinity. Some numerical experiments are given to illustrate the performance of our approach.

Keywords: Template estimation, Curve alignment, Stochastic inverse problem, Oracle inequality, 
1 Introduction

1.1 Model and objectives

The goal of this paper is to study a special class of stochastic inverse problems. We consider the problem of estimating a curve $f$, called template or shape function, from the observations of $n$ noisy and randomly shifted curves $Y_1, \ldots, Y_n$ coming from the following Gaussian white noise model:

$$dY_j(x) = f(x - \tau_j)\,dx + \epsilon\,dW_j(x), \quad x \in [0,1],\ j = 1,\ldots,n, \tag{1.1}$$

where the $W_j$ are independent standard Brownian motions on $[0,1]$, $\epsilon$ represents a level of noise common to all curves, the $\tau_j$'s are unknown random shifts, $f$ is the unknown template to recover, and $n$ is the number of observed curves, which may tend to infinity when studying asymptotic properties. This model is realistic in many situations where it is reasonable to assume that the observed curves represent replications of almost the same process and when a large source of variation in the experiments is due to transformations of the time axis. Such a model is commonly used in many applied areas dealing with functional data such as neuroscience (see e.g. [IRT08]) or biology (see e.g. [Ron98]). A well known problem in functional data analysis is the alignment of similar curves that differ by a time transformation to extract their common features, and (1.1) is a simple model where $f$ represents such common features (see [RS02], [RS05] for a detailed introduction to curve alignment problems in statistics).

The function $f : \mathbb{R} \to \mathbb{R}$ is assumed to be of period 1 so that the model (1.1) is well defined, and the shifts $\tau_j$ are supposed to be independent and identically distributed (i.i.d.) random variables with density $g : \mathbb{R} \to \mathbb{R}$ with respect to the Lebesgue measure $dx$ on $\mathbb{R}$. Estimating $f$ can be seen as a stochastic inverse problem as this template is not observed directly, but through $n$ independent realizations of the stochastic operator $A_\tau : L^2_p([0,1]) \to L^2_p([0,1])$ defined by

$$A_\tau(f)(x) = f(x - \tau), \quad x \in [0,1],$$

where $L^2_p([0,1])$ denotes the space of square integrable functions on $[0,1]$ with period 1, and $\tau$ is a random variable with density $g$. The additive Gaussian noise makes this problem ill-posed, and [BG09] have shown that estimating $f$ in such models is in fact a deconvolution problem where the density $g$ of the random shifts plays the role of the convolution operator. For the $L^2$ risk on $[0,1]$, [BG09] have derived the minimax rate of convergence for the estimation of $f$ over Besov balls as $n$ tends to infinity. This minimax rate depends both on the smoothness of the template and on the decay of the Fourier coefficients of the density $g$. This is a well known fact for standard deterministic deconvolution problems in statistics, see e.g. [Fan91], [Don95], but the results in [BG09] represent a novel contribution and a new point of view on template estimation in stochastic inverse problems such as (1.1).

However, the approach followed in [BG09] is only asymptotic, and the main goal of this paper is to derive non-asymptotic results to study the estimation of $f$ while keeping fixed the number $n$ of observed curves.



1.1.1 Deconvolution formulation 

Let us first explain how the model (1.1) can be transformed into a deconvolution problem such as the one studied in [DJKP95]. Denote by $G$ the following density function defined on $[0,1]$:

$$G(x) = \sum_{k \in \mathbb{Z}} g(x + k).$$

The density $G$ exists as soon as $g$ satisfies the weak condition $g(x) \le C/(1+|x|^{\nu})$ for some $\nu > 1$ and a suitable constant $C$. Note that the Fourier coefficients of $G$ are given by

$$\int_0^1 G(t)\,e^{-i2\pi l t}\,dt = \int_{-\infty}^{+\infty} g(t)\,e^{-i2\pi l t}\,dt = \gamma_l.$$

Consider now the 1-periodization of $f$ extended to $\mathbb{R}$; one has

$$\int_0^1 f(x - \tau)G(\tau)\,d\tau = \int_{-\infty}^{+\infty} f(x - \tau)g(\tau)\,d\tau.$$

The observations $Y_j$ can be written as

$$dY_j(x) = f \star G(x)\,dx + \xi_j(x)\,dx + \epsilon\,dW_j(x), \tag{1.2}$$

where $\xi_j$ is a second noise term defined as $\xi_j(x) = f(x - \tau_j) - f \star G(x)$. Hence, our model can be seen as a deconvolution problem with a noisy operator $H : f \mapsto f \star G + \xi$ and a more classical independent additive noise $W$. Note also that the realizations $H_j : f \mapsto f \star G + \xi_j$ are unbiased realizations of the operator $H$ but present a variance term which depends on the function $f$ we want to estimate. This appears to be a new setting in the field of inverse problems with unknown operators as considered in [CH05], [EK01], [HR05], [Mar06] and [CR07].

We will see in the sequel that the additive noise $\xi$, which depends on $f$, slightly modifies the quadratic risk and the way to estimate $f$ when compared to classical procedures used in standard inverse problems with a deterministic operator.
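
As a numerical illustration of the identity between the Fourier coefficients of the periodized density $G$ and those of $g$, the sketch below uses the Laplace shift density of Section 4, for which the coefficients have the closed form $1/(1+2\sigma^2\pi^2 l^2)$. The function names, the truncation of the sum defining $G$, and the midpoint quadrature are illustrative assumptions, not part of the paper.

```python
import numpy as np

def laplace_density(x, sigma):
    """Laplace density with standard deviation sigma (shift law of Section 4)."""
    return np.exp(-np.sqrt(2.0) * np.abs(x) / sigma) / (np.sqrt(2.0) * sigma)

def periodized_density(x, sigma, n_terms=20):
    """Truncated sum G(x) = sum_k g(x + k); n_terms is an illustrative cut."""
    ks = np.arange(-n_terms, n_terms + 1)
    return laplace_density(x[:, None] + ks[None, :], sigma).sum(axis=1)

def fourier_coeff_of_G(l, sigma, m=20000):
    """Midpoint-rule approximation of int_0^1 G(t) exp(-i 2 pi l t) dt."""
    t = (np.arange(m) + 0.5) / m
    return np.mean(periodized_density(t, sigma) * np.exp(-2j * np.pi * l * t))

def gamma_laplace(l, sigma):
    """Closed form of int_R g(t) exp(-i 2 pi l t) dt for the Laplace density."""
    return 1.0 / (1.0 + 2.0 * sigma**2 * np.pi**2 * l**2)

sigma = 0.1
# Absolute gap between both sides of the identity for l = 0, ..., 5
errors = [abs(fourier_coeff_of_G(l, sigma) - gamma_laplace(l, sigma)) for l in range(6)]
print(max(errors))
```

The gap is only limited by the quadrature error, which supports reading the shift density as the convolution kernel of the problem.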
1.2 Fourier Analysis and an inverse problem formulation

Supposing that $f \in L^2_p([0,1])$, we denote by $\theta_k$ its $k$-th Fourier coefficient, namely:

$$\theta_k = \int_0^1 e^{-2ik\pi x} f(x)\,dx.$$

In the Fourier domain, the model can be rewritten as

$$c_{j,k} := \int_0^1 e^{-2ik\pi x}\,dY_j(x) = \theta_k e^{-i2\pi k\tau_j} + \epsilon z_{k,j}, \tag{1.3}$$

where the $z_{k,j}$ are i.i.d. $N_{\mathbb{C}}(0,1)$ variables, i.e. complex Gaussian variables with zero mean and such that $\mathbb{E}|z_{k,j}|^2 = 1$. This means that the real and imaginary parts of the $z_{k,j}$'s are Gaussian variables with zero mean and variance 1/2. Thus, we can compute the sample mean of the $k$-th Fourier coefficient over the $n$ curves as

$$\tilde c_k := \frac{1}{n}\sum_{j=1}^{n} c_{j,k} = \theta_k\tilde\gamma_k + \frac{\epsilon}{\sqrt{n}}\,\xi_k, \tag{1.4}$$

where

$$\tilde\gamma_k := \frac{1}{n}\sum_{j=1}^{n} e^{-i2\pi k\tau_j}, \tag{1.5}$$

and the $\xi_k$'s are i.i.d. complex Gaussian variables with zero mean and variance 1. The Fourier coefficients in equation (1.4) can be viewed as observations coming from a statistical inverse problem. Indeed, the standard sequence space model of an ill-posed statistical inverse problem is (see [CGPT02] and the references therein)

$$c_k = \theta_k\gamma_k + \sigma z_k, \tag{1.6}$$

where the $\gamma_k$'s are eigenvalues of a known linear operator, the $z_k$ are random noise variables and $\sigma$ is a level of noise which goes to zero for studying asymptotic properties. The issue in such models is to recover the coefficients $\theta_k$ from the observations $c_k$ under various conditions on the decay to zero of the $\gamma_k$'s as $|k| \to +\infty$. A large class of estimators for the problem (1.6) can be written as

$$\hat\theta_k = \lambda_k\,\frac{c_k}{\gamma_k},$$

where $\lambda = (\lambda_k)_{k \in \mathbb{Z}}$ is a sequence of reals called filter. Various estimators of this form have been studied in a number of papers, and we refer to [CGPT02] for more details.

In a sense, we can view equation (1.4) as an inverse problem (with $\sigma = \epsilon/\sqrt{n}$) where the eigenvalues of the linear operator are the Fourier coefficients of the density $g$ of the shifts, i.e.

$$\gamma_k := \mathbb{E}\left(e^{-i2\pi k\tau}\right) = \int_{-\infty}^{+\infty} e^{-i2\pi kx} g(x)\,dx.$$

Indeed, let us assume that the density $g$ of the random shifts is known. In this case, to estimate the Fourier coefficients of $f$, one can perform a deconvolution step of the form

$$\hat\theta_k = \lambda_k\,\frac{\tilde c_k}{\gamma_k}, \tag{1.7}$$

where $\tilde c_k$ is defined in (1.4) and $\lambda = (\lambda_k)_{k \in \mathbb{Z}}$ is a filter whose choice will be discussed later on. Theoretical properties and optimal choices for the filter $\lambda$ will be presented in the case where the coefficients $\gamma_k$ are known. Such a framework is commonly used in inverse problems such as (1.6) to obtain consistency results and to study asymptotic rates of convergence, where it is generally supposed that the law of the additive error is Gaussian with zero mean and known variance $\sigma^2$, see e.g. [CGPT02]. In model (1.1), the random shifts may be viewed as a second source of noise and for the theoretical analysis of this problem the law of this other random noise is also supposed to be known.
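
To make the sequence-space model concrete, the following sketch draws the averaged coefficients of (1.4) directly in the Fourier domain and applies the deconvolution step (1.7) with a projection filter. It is a minimal illustration under stated assumptions: the template coefficients `theta`, the noise level `eps`, the cut-off at frequency 5, and the Laplace shift law (borrowed from Section 4) are all hypothetical choices, and the function names are ours.

```python
import numpy as np

def laplace_gamma(freqs, sigma):
    """Fourier coefficients gamma_k of a Laplace density with std sigma."""
    return 1.0 / (1.0 + 2.0 * sigma**2 * np.pi**2 * freqs**2)

def simulate_mean_coefficients(theta, freqs, n, eps, sigma, rng):
    """Draw c_tilde_k = theta_k * gamma_tilde_k + eps / sqrt(n) * xi_k, as in (1.4)."""
    # NumPy's Laplace scale b has variance 2 b^2, hence b = sigma / sqrt(2)
    tau = rng.laplace(loc=0.0, scale=sigma / np.sqrt(2.0), size=n)
    gamma_tilde = np.exp(-2j * np.pi * np.outer(freqs, tau)).mean(axis=1)
    # Complex Gaussian noise with E|xi_k|^2 = 1
    xi = (rng.standard_normal(freqs.size)
          + 1j * rng.standard_normal(freqs.size)) / np.sqrt(2.0)
    return theta * gamma_tilde + eps / np.sqrt(n) * xi, gamma_tilde

def deconvolve(c_tilde, gamma, lam):
    """Filtered deconvolution step (1.7): theta_hat_k = lam_k * c_tilde_k / gamma_k."""
    return lam * c_tilde / gamma

rng = np.random.default_rng(0)
freqs = np.arange(-10, 11)
theta = 1.0 / (1.0 + freqs**2)           # hypothetical template coefficients
sigma, eps, n = 0.1, 0.5, 5000           # hypothetical simulation settings
gamma = laplace_gamma(freqs, sigma)
lam = (np.abs(freqs) <= 5).astype(float) # projection filter, cut-off 5
c_tilde, gamma_tilde = simulate_mean_coefficients(theta, freqs, n, eps, sigma, rng)
theta_hat = deconvolve(c_tilde, gamma, lam)
```

Note that the division uses the deterministic `gamma` while the data were generated with the empirical `gamma_tilde`; this mismatch is what produces the extra variance term discussed in Section 2.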

Recently, some papers have addressed the problem of regularization with a partially known operator. For instance, [CH05] consider the case where the eigenvalues are unknown but independently observed. They deal with the model:

$$c_k = \gamma_k\theta_k + \epsilon\xi_k, \quad \hat\gamma_k = \gamma_k + \sigma\eta_k, \quad \forall k \in \mathbb{N}, \tag{1.8}$$

where $(\xi_k)_{k \in \mathbb{N}}$ and $(\eta_k)_{k \in \mathbb{N}}$ denote i.i.d. standard Gaussian variables. In this case, each coefficient $\theta_k$ can be estimated by $\hat\gamma_k^{-1} c_k$. Similar models have been considered in [CR07], [Mar06] or [Mar09]. In a more general setting, we may refer to [EK01] and [HR05].

In this paper, our framework is slightly different in the sense that the operator is stochastic, but the regularization is operated using deterministic eigenvalues. Hence the approach followed in the previous papers is not directly applicable to model (1.1). We believe that estimating $f$ in model (1.1) without the knowledge of $g$ remains a difficult task, and this paper is a first step to address this issue.

1.3 Previous work in template estimation and shift recovery 

The problem of estimating the common shape of a set of curves that differ by a time transformation is usually referred to as the curve registration problem, and it has received a lot of attention in the literature over the last two decades. Among the various methods that have been proposed, one can distinguish between landmark-based approaches, which aim at aligning common structural points of the curves (typically locations of extrema), see e.g. [GK95], [GK92], [Big06], and nonparametric modeling of the warping functions to align a set of curves, see e.g. [RL01], [WG97], [LM04]. However, in these papers, studying consistent estimates of the common shape $f$ as the number of curves $n$ tends to infinity is generally not considered.

In the simplest case of shifted curves, various approaches have been developed. Self-modelling regression methods proposed by [KG88] are semiparametric models where each observed curve is a parametric transformation of a common regression function. Such models are usually referred to as shape invariant models and estimation in this setting is usually done by iterating the following two steps: estimation of the parameters of the transformations (here the shifts) given a reference curve, and nonparametric estimation of a template by aligning the observed curves given a set of known transformation parameters. [KG88] studied the consistency of such a two-step procedure in an asymptotic framework where both the number of functions $n$ and the number of observed points per curve grow to infinity. Due to the asymptotic equivalence between the white noise model and nonparametric regression with an equi-spaced design (see [BL96]), such an asymptotic framework in our setting would correspond to the case where $n$ tends to infinity and $\epsilon$ goes to zero. In this paper we prefer to focus only on the case where $n$ may tend to infinity, and to keep fixed the level of additive noise in each observed curve.

Based on a model with curves observed at discrete time points, semiparametric estimation of the shifts and the shape function is proposed in [LMG07] and [Vim08] as the number of observations per curve grows, but with a fixed number $n$ of curves. A generalisation of this approach for the estimation of scaling, rotation and translation parameters for two-dimensional images is proposed in [BGV08], also with a fixed number of observed images. Semiparametric and adaptive estimation of a shift parameter in the case of a single observed curve in a white noise model is also considered by [DGT06] and [Dal07]. Estimation of a common shape for randomly shifted curves with asymptotics in $n$ is considered in [Ron98] from the point of view of semiparametric estimation when the parameter of interest is infinite dimensional.

However, in all the above cited papers rates of convergence or oracle inequalities for the estimation of the template are generally not studied. Moreover, our procedure differs from the approaches classically used in curve registration as our estimator is obtained in only one very simple step, and it is not based on an alternating scheme between estimation of the shifts and averaging of back-transformed curves given estimated values of the shift parameters.

Finally, note that [CL08] and [IRT08] consider a model similar to (1.1), but they rather focus on the estimation of the density $g$ of the shifts as $n$ tends to infinity. Using such an approach could be a good start for studying the estimation of the template $f$ without the knowledge of $g$. However, we believe that this is far beyond the scope of this paper, and we prefer to leave this problem open for future work.

1.4 Organization of the paper 

In Section 2 we consider an estimator of the shape function $f$ based on spectral cut-off when the eigenvalues $\gamma_k$ are known. Based on the principle of unbiased risk minimization developed by [CGPT02], we derive an oracle inequality that is then used to derive an adaptive estimator of $f$ on Sobolev spaces. This estimator is based on the Fourier transform of the curves with a data-based choice of the frequency cut-off. In Section 3 we study asymptotic properties of this estimator in terms of minimax rates of convergence over Sobolev balls. Finally, in Section 4 a short simulation study is proposed to illustrate the numerical properties of the estimator. All proofs are deferred to a technical section at the end of the paper.

2 Estimation of the common shape 

In the following, we assume that the Fourier coefficients $\gamma_k$ are known. In this situation it is possible to choose a data-dependent filter $\lambda^\star$ that mimics the performance of an optimal filter $\lambda^0$, called oracle, that would be obtained if we knew the true template $f$. The performance of this data-dependent filter is related to the performance of the filter $\lambda^0$ via an oracle inequality. In this section, most of our results are non-asymptotic and are thus related to the approach proposed in [CGPT02] to study standard statistical inverse problems via oracle inequalities.

2.1 Smoothness assumptions for the density g 

In a deconvolution problem, it is well known that the difficulty of estimating $f$ is quantified by the decay to zero of the $\gamma_k$'s as $|k| \to +\infty$. Depending on how fast these Fourier coefficients tend to zero as $|k| \to +\infty$, the reconstruction of $f$ will be more or less accurate. This phenomenon was systematically studied by [Fan91] in the context of density deconvolution. In this paper, the following type of assumption on $g$ is considered:

Assumption 2.1 The Fourier coefficients of $g$ have a polynomial decay, i.e. for some real $\beta > 0$, there exist two constants $C_{\max} \ge C_{\min} > 0$ such that for all nonzero $k \in \mathbb{Z}$

$$C_{\min}|k|^{-\beta} \le |\gamma_k| \le C_{\max}|k|^{-\beta}. \tag{2.1}$$

Remark that the knowledge of the constants $C_{\max}$, $C_{\min}$ and $\beta$ will not be necessary for the construction of our estimator.
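
As a quick check of Assumption 2.1 on a concrete case, the sketch below uses the Laplace shift density of Section 4, whose Fourier coefficients are $\gamma_k = 1/(1+2\sigma^2\pi^2k^2)$, and verifies that $|\gamma_k|\,|k|^{\beta}$ stays between two positive constants for $\beta = 2$. The frequency range and variable names are illustrative choices.

```python
import numpy as np

sigma = 0.1                      # Laplace standard deviation used in Section 4
beta = 2.0                       # candidate degree of ill-posedness
k = np.arange(1, 2001)           # illustrative range of positive frequencies
gamma = 1.0 / (1.0 + 2.0 * sigma**2 * np.pi**2 * k**2)

# If (2.1) holds with rate beta, this product is bounded above and below
scaled = gamma * k**beta
c_min, c_max = scaled.min(), scaled.max()
print(c_min, c_max)
```

Here the product increases from $1/(1+2\sigma^2\pi^2)$ at $k=1$ towards the limit $1/(2\sigma^2\pi^2)$, so both constants of (2.1) can be read off directly.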

2.2 Risk decomposition 

Assuming that $\gamma_k \neq 0$ for all $k \in \mathbb{Z}$, we recall that an estimator of the $\theta_k$'s is given by, see equation (1.7),

$$\hat\theta_k = \lambda_k\,\frac{\tilde c_k}{\gamma_k},$$

where $\lambda = (\lambda_k)_{k \in \mathbb{Z}}$ is a real sequence. Examples of commonly used filters include projection weights $\lambda_k = 1_{\{|k| \le N\}}$ for some integer $N$, and the Tikhonov weights $\lambda_k = 1/(1 + (|k|/\nu_2)^{\nu_1})$ for some parameters $\nu_1 > 0$ and $\nu_2 > 0$. Based on the $\hat\theta_k$'s, one can estimate the signal $f$ using the Fourier reconstruction formula.

The problem is then to choose the sequence $(\lambda_k)_{k \in \mathbb{Z}}$ in an optimal way with respect to an appropriate risk. For a given filter $\lambda$ we use the classical $\ell_2$-norm to define the risk of the estimator $\hat\theta(\lambda) = (\hat\theta_k)_{k \in \mathbb{Z}}$:

$$R(\theta,\lambda) = \mathbb{E}\|\hat\theta(\lambda) - \theta\|^2 = \mathbb{E}\sum_{k \in \mathbb{Z}} |\hat\theta_k - \theta_k|^2. \tag{2.2}$$

Note that analyzing the above risk (2.2) is equivalent to analyzing the mean integrated square risk $R(\hat f_\lambda, f) = \mathbb{E}\|\hat f_\lambda - f\|^2 = \mathbb{E}\left(\int_0^1 (\hat f_\lambda(x) - f(x))^2\,dx\right)$ for the estimator $\hat f_\lambda(x) = \sum_{k \in \mathbb{Z}} \hat\theta_k e^{2ik\pi x}$. The following lemma gives the bias-variance decomposition of $R(\theta,\lambda)$.

Lemma 2.1 For any given nonrandom filter $\lambda$, the risk of the estimator $\hat\theta(\lambda)$ can be decomposed as

$$R(\theta,\lambda) = \underbrace{\sum_{k \in \mathbb{Z}} (\lambda_k - 1)^2|\theta_k|^2}_{\text{Bias}} + \underbrace{\frac{\epsilon^2}{n}\sum_{k \in \mathbb{Z}} \frac{\lambda_k^2}{|\gamma_k|^2}}_{V_1} + \underbrace{\frac{1}{n}\sum_{k \in \mathbb{Z}} \lambda_k^2|\theta_k|^2\left(\frac{1}{|\gamma_k|^2} - 1\right)}_{V_2}. \tag{2.3}$$

For a fixed number of curves $n$ and a given shape function $f$, the problem of choosing an optimal filter in a set of possible candidates is to find the best tradeoff between low bias and low variance in the above expression. However, this decomposition does not correspond exactly to the classical bias-variance decomposition for linear inverse problems. Indeed, the variance term in (2.3) is the sum of two terms and differs from the classical expression of the variance for linear estimators in statistical inverse problems. Using our notations, the classical variance term is $V_1 = \frac{\epsilon^2}{n}\sum_{k \in \mathbb{Z}} \lambda_k^2/|\gamma_k|^2$ and appears in most linear inverse problems.

However, contrary to standard inverse problems, the variance term of the risk also depends on the Fourier coefficients $\theta_k$ of the unknown function $f$ to recover. Indeed, our data $\gamma_k^{-1}\tilde c_k$ are noisy observations of $\theta_k$:

$$\gamma_k^{-1}\tilde c_k = \theta_k + \left(\frac{\tilde\gamma_k}{\gamma_k} - 1\right)\theta_k + \frac{\epsilon}{\sqrt{n}}\,\frac{\xi_k}{\gamma_k},$$

and we invert the problem using the sequence $(\gamma_k)_{k \in \mathbb{N}}$ instead of $(\tilde\gamma_k)_{k \in \mathbb{N}}$, which is involved in the construction of the coefficient $\tilde c_k$. This explains the presence of the second term $V_2$. In particular, the quadratic risk is expressed in its usual form in the case where $\tilde\gamma_k = \gamma_k$.

A similar phenomenon occurs with the model (1.8), although it is more difficult to quantify. Indeed, in this setting:

$$\hat\gamma_k^{-1} c_k = \theta_k + \left(\frac{\gamma_k}{\hat\gamma_k} - 1\right)\theta_k + \epsilon\,\hat\gamma_k^{-1}\xi_k, \quad \forall k \in \mathbb{N}.$$

Hence, we also observe an additional term depending on $\theta$. This term is controlled using a Taylor expansion but the quadratic risk cannot be expressed in a simple form. We refer to [Mar09] for a discussion with some numerical simulations and to [CH05], [EK01], [HR05], [Mar06] and [CR07].
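
The three terms of the decomposition in Lemma 2.1 are easy to evaluate once the coefficients are given. The sketch below is a direct transcription of the bias, $V_1$ and $V_2$ terms for an arbitrary nonrandom filter; the template coefficients, noise level and Laplace shift law are hypothetical inputs chosen for illustration, and `risk_terms` is our own name.

```python
import numpy as np

def risk_terms(theta, gamma, lam, eps, n):
    """Return (bias, V1, V2) of the decomposition (2.3) for a nonrandom filter lam."""
    abs_theta2 = np.abs(theta) ** 2
    inv_gamma2 = 1.0 / np.abs(gamma) ** 2
    bias = np.sum((lam - 1.0) ** 2 * abs_theta2)
    v1 = eps**2 / n * np.sum(lam**2 * inv_gamma2)                 # classical variance
    v2 = np.sum(lam**2 * abs_theta2 * (inv_gamma2 - 1.0)) / n     # shift-induced variance
    return bias, v1, v2

freqs = np.arange(-50, 51)
theta = 1.0 / (1.0 + freqs**2)                                    # hypothetical template
sigma, eps, n = 0.1, 0.5, 100                                     # hypothetical settings
gamma = 1.0 / (1.0 + 2.0 * sigma**2 * np.pi**2 * freqs**2)        # Laplace shifts
lam = (np.abs(freqs) <= 5).astype(float)                          # projection filter

bias, v1, v2 = risk_terms(theta, gamma, lam, eps, n)
# Without shifts, |gamma_k| = 1 for all k and the extra term V2 vanishes
_, _, v2_no_shift = risk_terms(theta, np.ones_like(gamma), lam, eps, n)
```

Setting all $|\gamma_k|$ to 1 removes $V_2$, which matches the remark that the risk takes its usual form when the empirical and deterministic eigenvalues coincide.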



2.3 An oracle estimator and unbiased estimation of the risk 

Suppose that one is given a finite set of possible candidate filters $\Lambda = (\lambda^N)_{N \in \mathcal{K}}$, with $\lambda^N = (\lambda^N_k)_{k \in \mathbb{Z}}$, $N \in \mathcal{K} \subset \mathbb{N}$, which satisfy some general conditions to be discussed later on. In the case of projection filters, $\Lambda$ can be for example the set of filters $\lambda^N_k = 1_{\{|k| \le N\}}$, $k \in \mathbb{Z}$, for $N = 1, \ldots, m_0$. Given a set of filters $\Lambda$, the best estimator corresponds to the filter $\lambda^0$, called oracle, which minimizes the risk $R(\theta,\lambda)$ over $\Lambda$, i.e.

$$\lambda^0 := \arg\min_{\lambda \in \Lambda} R(\theta,\lambda). \tag{2.4}$$

This filter is called an oracle because it cannot be computed in practice as the sequence of coefficients $\theta$ is unknown. However, the oracle $\lambda^0$ can be used as a benchmark to evaluate the quality of a data-dependent filter $\lambda^\star$ chosen in the set $\Lambda$. This is the main interpretation of the oracle inequality that we will develop in the next section.

Now, suppose that it is possible to construct an unbiased estimator $\hat\Theta_k^2$ of $|\theta_k|^2$. For any nonrandom filter $\lambda$, using $\hat\Theta_k^2$, one can compute an estimator $U(\lambda,X)$ of the risk $R(\theta,\lambda)$. Then, for choosing a data-dependent filter, the principle of unbiased risk estimation (see [CGPT02] for further details) simply suggests minimizing the criterion $U(\lambda,X)$ over $\lambda \in \Lambda$ instead of the criterion $R(\theta,\lambda)$. Our data-dependent choice of $\lambda$ is thus

$$\lambda^\star := \arg\min_{\lambda \in \Lambda} U(\lambda,X). \tag{2.5}$$

Typically, in practice, all the filters $\lambda \in \Lambda$ are such that $\lambda_k = 0$ (or vanishingly small) for all $k$ large enough. Hence, for such choices of filters, numerical computation of the above expression is feasible since it only involves the computation of finite sums.
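
The oracle of (2.4) can be computed explicitly in a simulation, where the coefficients $\theta$ are known. The sketch below does this for projection filters by evaluating the risk of Lemma 2.1 on a finite grid of cut-offs; the template coefficients, the Laplace shift law, the noise level and the upper bound `max_N` are hypothetical settings, and the function names are ours.

```python
import numpy as np

def projection_risk(theta, gamma, freqs, N, eps, n):
    """Risk (2.3) of the projection filter lam_k = 1{|k| <= N}."""
    keep = np.abs(freqs) <= N
    abs_theta2 = np.abs(theta) ** 2
    inv_gamma2 = 1.0 / np.abs(gamma) ** 2
    bias = abs_theta2[~keep].sum()
    v1 = eps**2 / n * inv_gamma2[keep].sum()
    v2 = (abs_theta2[keep] * (inv_gamma2[keep] - 1.0)).sum() / n
    return bias + v1 + v2

def oracle_cutoff(theta, gamma, freqs, eps, n, max_N):
    """Cut-off in {1, ..., max_N} minimizing the true risk, as in (2.4)."""
    risks = [projection_risk(theta, gamma, freqs, N, eps, n)
             for N in range(1, max_N + 1)]
    return 1 + int(np.argmin(risks)), risks

freqs = np.arange(-30, 31)
theta = 1.0 / (1.0 + freqs**2)                               # hypothetical template
sigma, n = 0.1, 100
gamma = 1.0 / (1.0 + 2.0 * sigma**2 * np.pi**2 * freqs**2)   # Laplace shifts

N_oracle, risks = oracle_cutoff(theta, gamma, freqs, eps=0.5, n=n, max_N=20)
# Even without additive noise the oracle stays finite because of the V2 term
N_oracle_noiseless, _ = oracle_cutoff(theta, gamma, freqs, eps=0.0, n=n, max_N=20)
```

With `eps=0.0`, adding frequency $N$ lowers the risk only while $|\gamma_N|^{-2} - 1 < n$, so the random shifts alone already impose a finite oracle cut-off.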

2.4 Oracle inequalities for projection filters
2.4.1 Unbiased Risk Estimation (URE)

For the sake of simplicity, we only consider spectral cut-off schemes in the following. In this case, $\Lambda$ corresponds to the set of filters $(1_{\{|k| \le N\}})_{k \in \mathbb{Z}}$ for $N \in \mathbb{N}$. All the results presented in this paper could be generalized to wider families of estimators (Tikhonov, Landweber, Pinsker,...). The price to pay is longer and more technical proofs.

From Lemma 2.1, the quadratic risk $R(\theta,\lambda) := R(\theta,N)$ of a projection filter can be written as:

$$\begin{aligned}
R(\theta,N) &= \sum_{|k| > N} |\theta_k|^2 + \frac{\epsilon^2}{n}\sum_{|k| \le N} |\gamma_k|^{-2} + \frac{1}{n}\sum_{|k| \le N} |\theta_k|^2\left(|\gamma_k|^{-2} - 1\right) \\
&= \|\theta\|^2 - \sum_{|k| \le N} |\theta_k|^2 + \frac{\epsilon^2}{n}\sum_{|k| \le N} |\gamma_k|^{-2} + \frac{1}{n}\sum_{|k| \le N} |\theta_k|^2\left(|\gamma_k|^{-2} - 1\right).
\end{aligned}$$

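We aim to minimize $R$ with respect to $N$ while $\theta$ is unknown. Using $\hat\Theta_k^2 = |\gamma_k|^{-2}\left\{|\tilde c_k|^2 - \frac{\epsilon^2}{n}\right\}$ as an unbiased estimator of $|\theta_k|^2$, we minimize $U$ defined as

$$U(Y,N) = -\left(1 - \frac{1}{n}\right)\sum_{|k| \le N} |\gamma_k|^{-2}\left\{|\tilde c_k|^2 - \frac{\epsilon^2}{n}\right\} + \frac{\epsilon^2}{n}\sum_{|k| \le N} |\gamma_k|^{-2} + \frac{1}{n}\sum_{|k| \le N} |\gamma_k|^{-4}\left\{|\tilde c_k|^2 - \frac{\epsilon^2}{n}\right\}, \tag{2.6}$$

which is an unbiased risk estimator of $R(\theta,N) - \|\theta\|^2$.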

Unfortunately, such a criterion does not lead to satisfying results. Contrary to the approach developed in [CH05], we take into account the error generated by the use of an approximation of the eigenvalues. The estimator related to the criterion (2.6) involves processes that require a specific treatment. In order to control these processes, we will consider in the following the criterion

$$\bar U(Y,N) = -\sum_{|k| \le N} |\gamma_k|^{-2}\left\{|\tilde c_k|^2 - \frac{\epsilon^2}{n}\right\} + \frac{\epsilon^2}{n}\sum_{|k| \le N} |\gamma_k|^{-2} + \frac{\log^2(n)}{n}\sum_{|k| \le N} |\gamma_k|^{-4}\left\{|\tilde c_k|^2 - \frac{\epsilon^2}{n}\right\}. \tag{2.7}$$

Remark that $\bar U(Y,N)$ can be written as $U(Y,N) + \mathrm{pen}(N)$ where $(\mathrm{pen}(N))_{N \in \mathbb{N}}$ denotes a penalty term. It appears from the proofs that this penalty is a natural candidate for the control of the processes involved in the behavior of the estimator constructed below. The associated data-based filter is defined as

$$N^\star = \arg\min_{N \le m_0} \bar U(Y,N), \tag{2.8}$$

where

$$m_0 = \inf\left\{k : |\gamma_k|^2 \le \frac{\log^2 n}{n}\right\} - 1. \tag{2.9}$$

Remark that we do not minimize our criterion $\bar U(Y,N)$ over all $N \in \mathbb{N}$ but rather for $N \le m_0$. Indeed, each coefficient $\theta_k$ is estimated by $\gamma_k^{-1}\tilde c_k$ where $\gamma_k = \mathbb{E}[\tilde\gamma_k]$. Hence, the ratio $\gamma_k^{-1}\tilde\gamma_k$ should be as close as possible to 1. Since $\gamma_k \to 0$ as $k \to +\infty$ while the variance of $\tilde\gamma_k$ remains of order $1/n$ whatever $k$, it seems clear that large $k$ should be avoided.

Similar bounds on the resolution level are used in papers related to partially known operators: see for instance [CH05] or [EK01]. These bounds have to be carefully chosen but are not of primary importance. In general, estimating the operator is easier than estimating the function $f$.
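
The data-driven choice (2.8) only needs the averaged coefficients, the known eigenvalues and the noise level. The sketch below is one possible transcription of the penalized criterion (2.7), under our reading that its last term is the plug-in estimate of the logarithmic term of the risk (2.12); this reading, the function names, and the test data are assumptions. The bound `m0` is passed in by the caller rather than computed, since its exact definition should be taken from (2.9).

```python
import numpy as np

def penalized_criterion(c_tilde, gamma, freqs, N, eps, n):
    """Penalized criterion, as we read (2.7), for the cut-off N."""
    keep = np.abs(freqs) <= N
    inv_gamma2 = 1.0 / np.abs(gamma[keep]) ** 2
    # Plug-in estimate of |theta_k|^2 built from the averaged coefficients
    theta2_hat = inv_gamma2 * (np.abs(c_tilde[keep]) ** 2 - eps**2 / n)
    return (-np.sum(theta2_hat)
            + eps**2 / n * np.sum(inv_gamma2)
            + np.log(n) ** 2 / n * np.sum(inv_gamma2 * theta2_hat))

def select_cutoff(c_tilde, gamma, freqs, eps, n, m0):
    """Minimize the criterion over N = 1, ..., m0, as in (2.8)."""
    values = [penalized_criterion(c_tilde, gamma, freqs, N, eps, n)
              for N in range(1, m0 + 1)]
    return 1 + int(np.argmin(values))

# Idealized check: no additive noise and gamma_tilde equal to gamma
freqs = np.arange(-30, 31)
theta = 1.0 / (1.0 + freqs**2)                               # hypothetical template
sigma, n = 0.1, 100
gamma = 1.0 / (1.0 + 2.0 * sigma**2 * np.pi**2 * freqs**2)   # Laplace shifts
N_star = select_cutoff(theta * gamma, gamma, freqs, eps=0.0, n=n, m0=10)
```

In this idealized case each added frequency contributes $2|\theta_N|^2(\log^2(n)|\gamma_N|^{-2}/n - 1)$ to the criterion, so the selected cut-off is the last frequency with $|\gamma_N|^2 > \log^2(n)/n$.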

2.4.2 Sharp estimator of the risk 

We are now able to propose a first adaptive estimator. In the following, we denote by $\theta^\star$ the estimator related to the bandwidth $N^\star$, namely

$$\theta^\star_k = \frac{\tilde c_k}{\gamma_k}\,1_{\{|k| \le N^\star\}}. \tag{2.10}$$

The next theorem summarizes the performances of $\theta^\star$ through a simple oracle inequality. The proof is postponed to Section 5.

Theorem 2.1 Let $\theta^\star$ be defined by (2.10) and assume that the density $g$ satisfies Assumption 2.1. Then, there exists $0 < \gamma_1 < 1$ such that, for all $0 < \gamma < \gamma_1$,

IM^-0|| 2 <(l + fci (7)) inf R(e,N) + Cie2 4 l +1 +—, (2.11) 

N<m n 7 4 P +i 717 

where

$$\bar R(\theta,N) = \sum_{|k| > N} |\theta_k|^2 + \frac{\epsilon^2}{n}\sum_{|k| \le N} |\gamma_k|^{-2} + \frac{\log^2(n)}{n}\sum_{|k| \le N} |\gamma_k|^{-2}|\theta_k|^2, \tag{2.12}$$

$h_1(\gamma) \to 0$ as $\gamma \to 0$ and $C_1$ denotes a positive constant independent of $\epsilon$ and $n$.

From Theorem 2.1, our estimator $\theta^\star$ behaves similarly to the minimizer of $\bar R(\theta,N)$. This term only differs from the quadratic risk by a log term. This result can be explained by the choice of the criterion (2.7). The last two terms in the right hand side of (2.11) decay at least as fast as $1/n$ and may thus be considered as negligible in most cases.

In the next section, we prove that our estimator attains the minimax rate of convergence on many functional spaces. In particular, the log term and the bandwidth $m_0$ have no influence on the performances of our estimator from a minimax point of view.



2.4.3 Rough estimator 

In the procedure described above, we have decided to take into account the error generated by the use of the sequence $(\gamma_k)_{k \in \mathbb{N}}$ instead of $(\tilde\gamma_k)_{k \in \mathbb{N}}$. Although their setting is slightly different from ours, papers dealing with regularization with unknown operator implicitly consider this error as negligible for the regularization. The goal is then to prove that the related estimators are not affected by the noise in the operator, i.e. this error is avoided in the oracle.

It is thus also possible to apply a similar scheme in our setting and consider the extra term highlighted in Lemma 2.1 as negligible. We introduce

$$\tilde R(\theta,N) = \sum_{|k| > N} |\theta_k|^2 + \frac{\epsilon^2}{n}\sum_{|k| \le N} |\gamma_k|^{-2}, \tag{2.13}$$

which corresponds to the usual quadratic risk in an inverse problem setting.
From now on, our aim is to mimic the oracle for $\tilde R(\theta,N)$, i.e.

$$N_0 = \arg\min_{N \in \mathbb{N}} \tilde R(\theta,N).$$

To this end, we use exactly the same scheme as for the construction of $\theta^\star$, starting from $\tilde R(\theta,N)$ instead of $\bar R(\theta,N)$. Define

$$\tilde U(Y,N) = -\sum_{|k| \le N} |\gamma_k|^{-2}\left\{|\tilde c_k|^2 - \frac{\epsilon^2}{n}\right\} + \frac{\epsilon^2}{n}\sum_{|k| \le N} |\gamma_k|^{-2}. \tag{2.14}$$



Then, we introduce

$$\tilde N = \arg\min_{N \le m_0} \tilde U(Y,N) \quad \text{and} \quad \tilde\theta_k = \frac{\tilde c_k}{\gamma_k}\,1_{\{|k| \le \tilde N\}}, \tag{2.15}$$

where $m_0$ has been introduced in (2.9). Hence, this estimator only differs from the previous one by the choice of the regularization parameter $N$. The performances of $\tilde\theta$ are detailed below.



Theorem 2.2 Let $\tilde\theta$ be defined by (2.15) and assume that the density $g$ satisfies Assumption 2.1. Then, there exists $0 < \gamma_2 < 1$ such that, for all $0 < \gamma < \gamma_2$,



E 9 ||0~-0|| 2 <(l + M7)) mf R(9,N) + ^-( m2l °f {n) ) + + ^, (2.16) 

where $h_2(\gamma) \to 0$ as $\gamma \to 0$ and $C_2$ denotes a positive constant independent of $\epsilon$ and $n$.

We will see in Section 3 that the performances of $\theta^\star$ and $\tilde\theta$ are essentially the same from a minimax point of view. The existing differences may be revealed by comparing the oracle inequalities obtained in Theorems 2.1 and 2.2, although this is always a difficult task. Since $\bar R(\theta,N)$ only differs from $R(\theta,N)$ by a log term, we may be interested in the residual of order $\|\theta\|^2$. For fixed $\epsilon$ and $n$, this term may be important compared to $\tilde R(\theta,N)$, in particular for large $\|\theta\|^2$. Hence, the second estimator may be inappropriate when estimating a function with large norm.

More precisely, $\tilde\theta$ is a pertinent choice as soon as $\tilde R(\theta,N)$ is close to $R(\theta,N)$. This can be seen from the study of the quadratic risk given in Lemma 2.1. For instance, with a fixed $\epsilon$, this will be the case for functions with 'small' Fourier coefficients (in particular small norms). On the other hand, as soon as $\epsilon$ becomes 'small', the behaviours of $\tilde R(\theta,N)$ and $R(\theta,N)$ may strongly differ. This may produce significant differences between the performances of $\theta^\star$ and $\tilde\theta$.



3 Minimax rates of convergence for Sobolev balls 

We provide in this section a short discussion about the performances of our estimator from the asymptotic minimax point of view. For this, let $1 \le p, q \le \infty$ and $A > 0$, and suppose that $f$ belongs to a Besov ball $B^s_{p,q}(A)$ of radius $A$ (see e.g. [DJKP95] for a precise definition of Besov spaces). [BG09] have derived the following asymptotic minimax lower bound for the quadratic risk over a large class of Besov balls.

Theorem 3.1 Let $1 \le p, q \le \infty$ and $A > 0$, let $p' = p \wedge 2$ and assume that:

• (Regularity condition on $f$) $f \in B^s_{p,q}(A)$ and $s > 1/p'$,

• (Regularity condition on $g$) $g$ satisfies the polynomial decay condition (2.1) at rate $\beta$ for its Fourier coefficients,

• (Dense case) $s \ge (2\beta + 1)(1/p - 1/2)$ and $s \ge 2\beta + 1$.

Then, there exists a universal constant $M_1$ depending on $A, s, p, q$ such that

$$\inf_{\hat f_n}\ \sup_{f \in B^s_{p,q}(A)} \mathbb{E}\|\hat f_n - f\|^2 \ge M_1\, n^{-\frac{2s}{2s+2\beta+1}}, \quad \text{as } n \to \infty,$$

where $\hat f_n \in L^2([0,1])$ denotes any estimator of the common shape $f$, i.e. a measurable function of the random processes $Y_j$, $j = 1, \ldots, n$.

Therefore, Theorem 3.1 extends the lower bound $n^{-\frac{2s}{2s+2\beta+1}}$ usually obtained in a classical deconvolution model to the more complicated model of deconvolution with a random operator derived from equation (1.2). Then, let us introduce the following smoothness class of functions which can be identified with a periodic Sobolev ball:

$$H_s(A) = \left\{ f \in L^2_p([0,1])\ ;\ \sum_{k \in \mathbb{Z}} (1 + |k|^{2s})|\theta_k|^2 \le A \right\},$$

for some constant $A > 0$ and some smoothness parameter $s > 0$, where $\theta_k = \int_0^1 e^{-2ik\pi x} f(x)\,dx$. It is known (see e.g. [DJKP95]) that if $s$ is not an integer then $H_s(A)$ can be identified with a Besov ball $B^s_{2,2}(A')$. Assuming $f \in H_s(A)$ with $s > 0$, the classical choice $N^\star \sim n^{\frac{1}{2s+2\beta+1}}$ yields that

$$R(\theta,N^\star) \sim \inf_{N \le m_0} R(\theta,N) \sim n^{-\frac{2s}{2s+2\beta+1}},$$

provided $N^\star \le m_0$. It can be checked that the choice (2.9) implies that $m_0 \sim n^{\frac{1}{2\beta}}$ and thus, for a sufficiently large $n$, we have that $N^\star < m_0$. Similarly, the choice $N^\star \sim n^{\frac{1}{2s+2\beta+1}}$ yields that

$$\bar R(\theta,N^\star) \sim \inf_{N \le m_0} \bar R(\theta,N) \sim \log^2(n)\, n^{-\frac{2s}{2s+2\beta+1}}.$$

Now, remark that for the two estimators $\theta^\star$ and $\tilde\theta$, Theorems 2.1 and 2.2 yield that $\mathbb{E}_\theta\|\theta^\star - \theta\|^2 = O\left(\inf_{N \le m_0} \bar R(\theta,N)\right)$ and $\mathbb{E}_\theta\|\tilde\theta - \theta\|^2 = O\left(\inf_{N \le m_0} R(\theta,N)\right)$ as $n \to +\infty$, since the additional terms in bounds (2.11) and (2.16) are of the order $O(n^{-1+\zeta})$ for a sufficiently small positive $\zeta$. Hence, combining the above arguments one finally obtains the following result:

Corollary 1 Suppose that the density $g$ satisfies the polynomial decay condition (2.1) at rate $\beta$ for its Fourier coefficients. Then, as $n \to +\infty$,

$$\sup_{f \in H_s(A)} \mathbb{E}_\theta\|\theta^\star - \theta\|^2 \sim \log^2(n)\, n^{-\frac{2s}{2s+2\beta+1}}$$

and

$$\sup_{f \in H_s(A)} \mathbb{E}_\theta\|\tilde\theta - \theta\|^2 \sim n^{-\frac{2s}{2s+2\beta+1}}.$$

From the lower bound obtained in Theorem 3.1 we conclude that, for $s \ge 2\beta + 1$, the performances of the estimator $\tilde\theta$ are asymptotically optimal from the minimax point of view, while the estimator $\theta^\star$ is near-optimal up to a $\log^2(n)$ factor. This near-optimal rate of convergence of $\theta^\star$ is due to the use of the penalised criterion $\bar U(Y,N)$, see (2.7), with a penalty term involving a $\log^2(n)$ factor used to eliminate the term $\frac{1}{n}\sum_{|k| \le N} |\gamma_k|^{-4}\left\{|\tilde c_k|^2 - \frac{\epsilon^2}{n}\right\}$ in the unbiased risk $U(Y,N)$, see (2.6).

This shows that the performances of $\theta^\star$ and $\tilde\theta$ are essentially the same from a minimax point of view.
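
The rate $n^{-2s/(2s+2\beta+1)}$ comes from balancing a squared bias of order $N^{-2s}$ against a variance of order $N^{2\beta+1}/n$. The sketch below checks this balance numerically on a simplified proxy with all constants set to 1; the proxy, the function name and the chosen values of $s$, $\beta$ and $n$ are illustrative assumptions and not the exact risk of the paper.

```python
import numpy as np

def balance_cutoff(n, s, beta, max_N=10000):
    """Minimize the proxy N^(-2s) + N^(2 beta + 1) / n over integer cut-offs."""
    N = np.arange(1, max_N + 1, dtype=float)
    proxy = N ** (-2.0 * s) + N ** (2.0 * beta + 1.0) / n
    return int(N[np.argmin(proxy)]), float(proxy.min())

s, beta = 2.0, 2.0
for n in (10**4, 10**6):
    N_best, value = balance_cutoff(n, s, beta)
    predicted_N = n ** (1.0 / (2 * s + 2 * beta + 1))
    predicted_rate = n ** (-2 * s / (2 * s + 2 * beta + 1))
    # Both ratios should stay bounded as n grows
    print(n, N_best / predicted_N, value / predicted_rate)
```

Both printed ratios stay of order one when $n$ changes, which is the sense of the symbol $\sim$ in the displays of this section.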
4 Numerical experiments

For the mean pattern $f$ to recover, we consider the smooth function shown in Figure 1(a). Then, we simulate $n = 100$ randomly shifted curves with shifts following a Laplace distribution $g(x) = \frac{1}{\sqrt{2}\sigma}\exp\left(-\sqrt{2}\,\frac{|x|}{\sigma}\right)$ with $\sigma = 0.1$. Gaussian noise with a moderate variance (different from that used in the Laplace distribution) is then added to each curve. A subsample of 10 curves is shown in Figure 1(b). The Fourier coefficients of the density $g$ are given by $\gamma_k = \frac{1}{1 + 2\sigma^2\pi^2k^2}$, which corresponds to a degree of ill-posedness $\beta = 2$.

The condition (2.9) thus leads to the choice $m_0 = 32$. Minimisation of the criteria (2.8) and (2.15) leads respectively to the choices $N^\star = 13$ and $\tilde N = 30$. An example of estimation by spectral cut-off using either the value of $N^\star$ or $\tilde N$ is displayed in Figure 1(c) and Figure 1(d). The estimator obtained with the frequency cut-off $N^\star = 13$ is very satisfactory, while the choice $\tilde N = 30$ seems to be too large as the resulting estimator in Figure 1(d) is not as smooth as the estimator with $N^\star = 13$.




Figure 1: Wave function. (a) Mean pattern $f$, (b) Sample of 10 curves out of $n = 100$, (c) Estimation by spectral cut-off with $N^\star = 13$, (d) Estimation by spectral cut-off with $\tilde N = 30$. The dotted curve corresponds to the true mean pattern $f$.

This result tends to suggest that minimising $\bar U(Y,N)$ leads to a smaller choice for the frequency cut-off than the one obtained by the minimisation of the criterion $\tilde U(Y,N)$. This is confirmed by the results displayed in Figure 2, which gives the histogram of the selected values for $N^\star$ and $\tilde N$ over $M = 100$ independent replications of the above described simulations. Clearly the value of $N^\star$ is generally much smaller than $\tilde N$, and thus minimising (2.15) may lead to undersmoothing, which illustrates numerically our discussion in Section 2 on the differences between $\theta^\star$ and $\tilde\theta$.

Figure 2: Selection of the frequency cut-off over $M = 100$ replications of the simulations (with $m_0 = 32$): (a) Histogram of the selected value for $N^\star$, (b) Histogram of the selected value for $\hat N$.

5 Proofs

Proof of Theorem 2.1. The proof proceeds as follows. First, we compute the quadratic risk of $\theta^\star$ and prove that it is close to $R(\theta,N^\star)$. The aim of the second part is to prove that $U(Y,N^\star)$ is close to $R(\theta,N^\star)$, even for a random bandwidth $N^\star$. Then, we use the fact that $N^\star$ minimizes the criterion $U(Y,N)$ over the integers smaller than $m_0$, and we compute the expectation of $U(Y, N)$ for all deterministic $N$ in order to obtain an oracle inequality.
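First,

$$
\begin{aligned}
E_\theta\|\theta^\star-\theta\|^2 &= E_\theta \sum_{|k|\le N^\star} |\theta^\star_k - \theta_k|^2 + E_\theta \sum_{|k|>N^\star} |\theta_k|^2 \\
&= E_\theta \sum_{|k|\le N^\star} \left|\frac{\tilde\gamma_k}{\gamma_k}-1\right|^2|\theta_k|^2 + \frac{\epsilon^2}{n}E_\theta \sum_{|k|\le N^\star}|\xi_k|^2|\gamma_k|^{-2} \\
&\quad + E_\theta\sum_{|k|>N^\star}|\theta_k|^2 + 2E_\theta\sum_{|k|\le N^\star}\frac{\epsilon}{\sqrt n}\,\mathrm{Re}\left((\gamma_k^{-1}\tilde\gamma_k-1)\theta_k\times\overline{\gamma_k^{-1}\xi_k}\right),
\end{aligned}
$$

where for a given $z \in \mathbb{C}$, $\mathrm{Re}(z)$ denotes the real part of $z$ and $\bar z$ its conjugate. The last equality can be rewritten as

$$
\begin{aligned}
E_\theta\|\theta^\star-\theta\|^2 &= E_\theta R(\theta,N^\star) + E_\theta \sum_{|k|\le N^\star} \left|\frac{\tilde\gamma_k}{\gamma_k}-1\right|^2|\theta_k|^2 + \frac{\epsilon^2}{n}E_\theta \sum_{|k|\le N^\star}|\gamma_k|^{-2}(|\xi_k|^2-1) \\
&\quad + 2E_\theta\sum_{|k|\le N^\star}\frac{\epsilon}{\sqrt n}\,\mathrm{Re}\left((\gamma_k^{-1}\tilde\gamma_k-1)\theta_k\times\overline{\gamma_k^{-1}\xi_k}\right) \\
&= E_\theta R(\theta,N^\star) + A_1 + A_2 + A_3, \qquad (5.1)
\end{aligned}
$$

where $R(\theta,N)$ is defined in (2.13). Thanks to Lemma 5.1, setting $K = 1$,

$$
A_1 = E_\theta \sum_{|k|\le N^\star} \left|\frac{\tilde\gamma_k}{\gamma_k}-1\right|^2|\theta_k|^2 \le \frac{\log^2(n)}{n}E_\theta \sum_{|k|\le N^\star}|\gamma_k|^{-2}|\theta_k|^2 + \frac{C}{n}. \qquad (5.2)
$$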



Now, consider a bound for $A_2$. For all $N \in \mathbb{N}$, set $\chi_N = \sum_{|k|\le N}|\gamma_k|^{-4}$. Then, for all $p \in ]1,2[$ and $1 > \gamma > 0$:



,4, 



-E e E hk\- 2 m 2 -l), 

|fc|<iV* 



— Ee 

n 



E b*r 2 (i* 



fc i 2 -i)- 7A /s^ p 



n 



< — Eg sup 

n at 



E i7 fc r 2 (ie fc | 2 -i)-7^ p 

|fe|<JV 

< 7 -E^ P + 
n 



+ 7 -E ev ^ P , 
n 



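The last step can be derived from a Doob inequality: see for instance [CG06]. Thanks to the polynomial Assumption 2.1 on the sequence $(\gamma_k)_k$ and setting $p = 2 \times (2\beta + 1)/(4\beta + 1)$, we obtain

$$
A_2 = \frac{\epsilon^2}{n}E_\theta\sum_{|k|\le N^\star}|\gamma_k|^{-2}(|\xi_k|^2-1) \le \gamma\,\frac{\epsilon^2}{n}E_\theta\sum_{|k|\le N^\star}|\gamma_k|^{-2} + \frac{C\epsilon^2}{n\gamma^{4\beta+1}}. \qquad (5.3)
$$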

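Then, for all $1 > B > 0$, using the Cauchy-Schwarz and Young inequalities with the bounds (5.2) and (5.3),

$$
\begin{aligned}
A_3 &= 2E_\theta\sum_{|k|\le N^\star}\frac{\epsilon}{\sqrt n}\,\mathrm{Re}\left((\gamma_k^{-1}\tilde\gamma_k-1)\theta_k\times\overline{\gamma_k^{-1}\xi_k}\right) \\
&\le B\,\frac{\epsilon^2}{n}E_\theta\sum_{|k|\le N^\star}|\gamma_k|^{-2}|\xi_k|^2 + B^{-1}E_\theta\sum_{|k|\le N^\star}|\theta_k|^2\left|\frac{\tilde\gamma_k}{\gamma_k}-1\right|^2.
\end{aligned}
$$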

Thus, for any if > 0, 

*<(S + B 7 )£e, E l^ + ^^E, E ^l" 2 N 2 + ^T + ^- 

\k\<N* \k\<N* ' 



With 5 = VK = ^7, we obtain from (jO)-(Q 



E ||r-0|| 2 < (1 + 7 + 2^7)^5(0, AT*) + 



Ce 2 



n7 



+ n' 



(5.4) 



(5.5) 



where $R(\theta,N)$ is defined in (2.12). This concludes the first step of our proof. Now, we write $U(Y, N^\star)$ in terms of $R(\theta, N^\star)$. In the following, we define $x_n = (1 - n^{-1})$. We have

U(Y t N*) ^ 2 < l0g2(n) 



n E i7 fc r 2 {ic,i 2 --) + - E i7 fc r 2 + ^^ E iTfci- 



|&|<N 

R(9,N 



»-;) E w- 2 {' c "'' 2 -C}- E i" 

7 |fc|<7V* v 7 |fc|>iV* 



|fc|<iV 
2 



Cfe 



+ 



log 2 (n) 



n 



E 

|fc|<7V* 



ck\ — > - hk\ \0k\ 

n 



This equality can be rewritten as 

R(9,N*) = U(Y,N*) + \\6\\ 2 + x n E {\lkV 2 \ck\ 2 -^\lk\- 2 \- E 1**1 

\k\<N* ^ 7 \k\<N* 

log 2 (ra) 



+ - 



E 

|fc|<JV* 



Cfc 



n 



. (5.6) 
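For all $k \in \mathbb{N}$,

$$
|c_k|^2 = |\theta_k\tilde\gamma_k|^2 + \frac{\epsilon^2}{n}|\xi_k|^2 + 2\epsilon n^{-1/2}\,\mathrm{Re}(\theta_k\tilde\gamma_k\bar\xi_k),
$$

and

$$
|\gamma_k|^{-2}|c_k|^2 = |\theta_k|^2\left|\frac{\tilde\gamma_k}{\gamma_k}\right|^2 + \frac{\epsilon^2}{n}|\gamma_k^{-1}\xi_k|^2 + 2\frac{\epsilon}{\sqrt n}|\gamma_k|^{-2}\,\mathrm{Re}(\theta_k\tilde\gamma_k\bar\xi_k).
$$

Since $x_n < 1$,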

nE, £ {i 7 r 2 N 2 -^r 2 }- £ 



\k\<N 

< e 9 e n 2 ( 

|fc|<iV* 
= E\ + E2 + -E3. 



7fe 



+ -^n V |7fer 2 (|6| 2 -l) + 2-^x n V \j k \- 2 Re{9 k %Ck), 

(5.7) 



First consider the bound of $E_1$. Thanks to Lemma 5.2 and some simple algebra



Et = E e £ \6 k \ 

\k\<N* 
J2 



Ik 

Ik 



< 2 7 i^E e £ |^| 2 | 7fc r 2 +7lE e £ \0 k \ 

\k\<N* \k\>N* 



+7 £ l^| 2 +7" 1 £ |^| 2 |7 fc r 2 (l-|7 fc P 

|fc|>7V |fc|<jV 

< 2 7 E e i?(0,iV*) + f 7 +_^ 7 - T > )5(fl,JVo) + ^, 



n 7 



2 ■ 



where 



log^ra) 



iV = arg min R(6,N). 

N<mo 



n 7 i 



The terms $E_2$ and $E_3$ are bounded using respectively (5.3) and Lemma 5.3. We get

u, £ {i7 fc r 2 ic fc i 2 -^i 7 ,r 2 -i^i 2 } 



< D^E e R(9, N*) + D-yR(6, N ) + 



e 2 C C 
+ 



We are now interested in the second residual term of (5.6). Thanks to the definition of $c_k$:

■2 



(5.8) 



log 2 n 



n 



^ e i7 fc r 2 {-i7 fc r 2 ic fe i 2 + ^i7 fc r 2 + i^i 2 } 



\k\<N 

E e E hk\- 2 \dk\ 2 (i 

\k\<N* V 



Tfc 
Ik 



+ - £ i7 fc r 4 (i-iai 2 )-2^ e I7*r 4 i2e(<?fc7fcefc), 

n ^-^ \/n ^-^ 

\k\<N* v \k\<N* 



< D-fE e R(6, N*) + DjR(6, N ) + 



e 2 C C 



n 7 4 / 3 + 1 n^f 2 ' 



(5.9) 



for some $D > 0$ independent of $\epsilon$ and $n$. Indeed, we can use essentially the same algebra as for the bound of the terms $E_1$, $E_2$ and $E_3$ and the inequality

$$
|\gamma_k|^{-2} \le \frac{n}{\log^2 n}.
$$
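Hence, using (5.8) and (5.9),

$$
(1 - D\gamma)E_\theta R(\theta, N^\star) \le E_\theta U(Y, N^\star) + \|\theta\|^2 + D\gamma R(\theta, N_0) + \frac{C\epsilon^2}{n\gamma^{4\beta+1}} + \frac{C}{n\gamma^2}. \qquad (5.10)
$$

From the definition of $N^\star$, we immediately get

$$
(1 - D\gamma)E_\theta R(\theta, N^\star) \le E_\theta U(Y, N_0) + \|\theta\|^2 + D\gamma R(\theta, N_0) + \frac{C\epsilon^2}{n\gamma^{4\beta+1}} + \frac{C}{n\gamma^2},
$$

where $N_0$ denotes the oracle bandwidth. Since

$$
E_\theta U(Y,N_0) = R(\theta,N_0) - \|\theta\|^2,
$$

we obtain

$$
(1 - D\gamma)E_\theta R(\theta, N^\star) \le (1 + D\gamma)R(\theta, N_0) + \frac{C\epsilon^2}{n\gamma^{4\beta+1}} + \frac{C}{n\gamma^2}. \qquad (5.11)
$$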
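To illustrate the oracle bandwidth $N_0$ that appears in this comparison, the sketch below minimises a risk of the form $\sum_{|k|>N}|\theta_k|^2 + \frac{\epsilon^2}{n}\sum_{|k|\le N}|\gamma_k|^{-2}$ over $N \le m_0$. This form is inferred from the decomposition leading to (5.1) and is an assumption: the definitions (2.12) and (2.13) may contain further terms. The template coefficients, the parameter values and the function names are hypothetical.

```python
import math


def gamma(k, sigma=0.1):
    """Fourier coefficient of a Laplace shift density with scale sigma."""
    return 1.0 / (1.0 + 2.0 * sigma ** 2 * math.pi ** 2 * k ** 2)


def risk(theta, cutoff, eps, n):
    """Assumed risk: squared bias of discarded frequencies plus noise variance."""
    bias = sum(abs(t) ** 2 for k, t in theta.items() if abs(k) > cutoff)
    variance = eps ** 2 / n * sum(gamma(k) ** -2 for k in theta if abs(k) <= cutoff)
    return bias + variance


def oracle_cutoff(theta, eps, n, m0):
    """Smallest minimiser of the assumed risk over cut-offs 0, ..., m0."""
    return min(range(m0 + 1), key=lambda cutoff: risk(theta, cutoff, eps, n))


if __name__ == "__main__":
    # Hypothetical template; its index range must contain all |k| <= m0.
    theta = {k: 1.0 / (1.0 + abs(k)) ** 2 for k in range(-40, 41)}
    for eps in (0.1, 0.5, 2.0):
        print(eps, oracle_cutoff(theta, eps=eps, n=100, m0=32))
```

The oracle depends on the unknown $\theta$, which is why it serves only as a benchmark for the data-driven choice $N^\star$.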

Using (5.5) and (5.11), we get:

E e \\e*-e\\ 2 < (i+D^)E e R(e,N*) + —^- T + —, 

n ^P+ L 717 
1 — 1J7 J n 7 4 p+ x nj 



This concludes the proof of Theorem 2.1.



□ 



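Proof of Theorem 2.2. The proof follows the same main lines as for Theorem 2.1. Equality (5.1) provides:

$$
\begin{aligned}
E_\theta\|\theta^\star-\theta\|^2 &= E_\theta R(\theta,N^\star) + E_\theta \sum_{|k|\le N^\star} |\theta_k|^2\left|\frac{\tilde\gamma_k}{\gamma_k}-1\right|^2 + \frac{\epsilon^2}{n}E_\theta \sum_{|k|\le N^\star}|\gamma_k|^{-2}(|\xi_k|^2-1) \\
&\quad + 2E_\theta\sum_{|k|\le N^\star}\frac{\epsilon}{\sqrt n}\,\mathrm{Re}\left((\gamma_k^{-1}\tilde\gamma_k-1)\theta_k\times\overline{\gamma_k^{-1}\xi_k}\right) \\
&= E_\theta R(\theta,N^\star) + A_1 + A_2 + A_3.
\end{aligned}
$$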

Thanks to Lemma 5.1 and an inequality of [CGPT02], we obtain for all $0 < \gamma < 1$:

e 2 C 
A x < log 2 (n)-E e sup | 7fc |- 2 |0 fc | 2 + - 

n \k\<N* n 

< AlW-^f^f^. (5-12) 
n n \ 7 / n 

\k\<N* V ' 7 

Then, for all $B > 0$, using the Cauchy-Schwarz and Young inequalities with the bounds (5.2), (5.3):



A 3 = 2E e £ ^((7^-1)^x7^) 

\k\<N* \ 1 / 1 

With the choice B = ^7, we obtain from (I5.1I) - (I5.4I) : 

E 9 \\e* - ef < (i + 3 7 + Vt)M(*, ^) + — f l|g||2l ° g2(n) ) 2 " + 4£t + - ■ ( 5 - 14 ) 

n \ 7 / nj^P +1 n 
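Then,

$$
\begin{aligned}
U(Y,N^\star) &= -\sum_{|k|\le N^\star}\left\{|\gamma_k|^{-2}|c_k|^2 - \frac{\epsilon^2}{n}|\gamma_k|^{-2}\right\} + \frac{\epsilon^2}{n}\sum_{|k|\le N^\star}|\gamma_k|^{-2} \\
&= -\sum_{|k|\le N^\star}\left\{|\gamma_k|^{-2}|c_k|^2 - \frac{\epsilon^2}{n}|\gamma_k|^{-2}\right\} - \sum_{|k|> N^\star}|\theta_k|^2 + \sum_{|k|> N^\star}|\theta_k|^2 + \frac{\epsilon^2}{n}\sum_{|k|\le N^\star}|\gamma_k|^{-2} \\
&= R(\theta,N^\star) - \sum_{|k|\le N^\star}\left\{|\gamma_k|^{-2}|c_k|^2 - \frac{\epsilon^2}{n}|\gamma_k|^{-2}\right\} - \sum_{|k|> N^\star}|\theta_k|^2.
\end{aligned}
$$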
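The behaviour of this criterion in expectation can be checked numerically for a deterministic cut-off $N$. The sketch below assumes that $c_k = \theta_k\tilde\gamma_k + \epsilon n^{-1/2}\xi_k$ with i.i.d. shifts and with $\xi_k$ centred, independent of the shifts and such that $E|\xi_k|^2 = 1$, which gives $E_\theta|c_k|^2 = |\theta_k|^2\left(|\gamma_k|^2 + (1-|\gamma_k|^2)/n\right) + \epsilon^2/n$. It also assumes that $R(\theta,N)$ is the sum of the squared bias and of the noise variance. Under these assumptions it evaluates $E_\theta U(Y,N)$ in closed form and compares it with $R(\theta,N) - \|\theta\|^2$; all names and numerical values are hypothetical.

```python
import math


def gamma(k, sigma=0.1):
    """Fourier coefficient of a Laplace shift density with scale sigma."""
    return 1.0 / (1.0 + 2.0 * sigma ** 2 * math.pi ** 2 * k ** 2)


def expected_c_sq(theta_k, k, eps, n):
    """E|c_k|^2 under the assumptions stated in the lead-in."""
    g2 = gamma(k) ** 2
    return abs(theta_k) ** 2 * (g2 + (1.0 - g2) / n) + eps ** 2 / n


def expected_U(theta, cutoff, eps, n):
    """Expectation of the criterion for a deterministic cut-off."""
    total = 0.0
    for k, theta_k in theta.items():
        if abs(k) <= cutoff:
            inv_g2 = gamma(k) ** -2
            total -= inv_g2 * (expected_c_sq(theta_k, k, eps, n) - eps ** 2 / n)
            total += eps ** 2 / n * inv_g2
    return total


def risk_minus_norm(theta, cutoff, eps, n):
    """Assumed risk (bias plus variance) minus the squared norm of theta."""
    bias = sum(abs(t) ** 2 for k, t in theta.items() if abs(k) > cutoff)
    variance = eps ** 2 / n * sum(gamma(k) ** -2 for k in theta if abs(k) <= cutoff)
    return bias + variance - sum(abs(t) ** 2 for t in theta.values())


if __name__ == "__main__":
    theta = {k: 1.0 / (1.0 + abs(k)) ** 2 for k in range(-20, 21)}
    for cutoff in (2, 5, 10):
        print(cutoff, expected_U(theta, cutoff, 0.5, 100),
              risk_minus_norm(theta, cutoff, 0.5, 100))
```

Under these assumptions the two quantities differ by $\frac{1}{n}\sum_{|k|\le N}|\theta_k|^2(|\gamma_k|^{-2}-1) \ge 0$, so the expected criterion never exceeds $R(\theta,N) - \|\theta\|^2$.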

This equality can be rewritten as

$$
R(\theta,N^\star) = U(Y,N^\star) + \|\theta\|^2 + \sum_{|k|\le N^\star}\left\{|\gamma_k|^{-2}|c_k|^2 - \frac{\epsilon^2}{n}|\gamma_k|^{-2} - |\theta_k|^2\right\}.
$$



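Hence,

$$
\begin{aligned}
E_\theta R(\theta,N^\star) &= E_\theta U(Y,N^\star) + \|\theta\|^2 + E_\theta\sum_{|k|\le N^\star}|\theta_k|^2\left(\left|\frac{\tilde\gamma_k}{\gamma_k}\right|^2 - 1\right) \\
&\quad + \frac{\epsilon^2}{n}E_\theta\sum_{|k|\le N^\star}|\gamma_k|^{-2}(|\xi_k|^2-1) + 2\frac{\epsilon}{\sqrt n}E_\theta\sum_{|k|\le N^\star}|\gamma_k|^{-2}\,\mathrm{Re}(\theta_k\tilde\gamma_k\bar\xi_k) \\
&= E_\theta U(Y,N^\star) + \|\theta\|^2 + E_1 + E_2 + E_3. \qquad (5.15)
\end{aligned}
$$

Using previous results: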



E 1 = E e E \0h? 

\k\<N* 



Ik 
Ik 



1 



C Ce 2 

< 2jE e R(9,N*)+jR(9,N ) + - + 



12 log 2 (re) 



re n 

The terms $E_2$ and $E_3$ are bounded using respectively (5.3) and Lemma 5.3. We get:
E e R(9, N*) < E e U(Y,N*) + \\0\\ 2 + D7E e R(e,N*) + D 1 R{6, N ) 



(5.16) 



C Ce 2 
+ — — + 



717 



2;3 



n 



|2 1 — ,2 



23 



Hence, 



{l-D~f)E B R{0,N*) < E e U(Y,N*) + \\9\\ 2 +DjR(9,N ) + -^ + C ' 



717 



2/3 



11 



12 1 — 2 



\og\n) 



From the definition of $N^\star$, we immediately get:



(l-D 1 )E e R(9,N*)<E e U(Y,N ) + \\9\\ 2 +D 1 R(9,N ) + - 



C Ce 2 



In order to conclude the proof, we prove that $E_\theta U(Y, N_0)$ is close to $R(\theta, N_0)$. First remark that:

Nq ( .2^ 2 N ° 



+ - 



T 



|2 1 — 2 



2;3 



. (5.17) 



log (re) 



2c* 



(5.18) 



E e C/(F,iVo) 



iv r 2 ^1 £ 2 JV 

En- 2 n 2 -- +-Ew 

^— ' re re 

fe=i ' fc=i 

E|^r 2 ^i 2 -i7 fe |- 2 --i^ 



N 2 AT„ 

Ei^ + ^En 



k=l 



k=l 
Since for all $k \in \mathbb{N}$:



rt / n 



we obtain, 



iv 



E t/(y,iVo) = -^|^| 



:hk\~ 2 



Therefore, 



n 



2 JV 

+ E i'*i 2 + ^5>r 2 



|fe|>AT 



fc=l 



'+ J R(^iVo)-||e|| 2 < J R(0,iVo)-||0|| 2 , 



fc=i 



n 



and 



(1 - D7)M(6>, N) < (1 + D 7 )i?(0, JV ) + 4l + — ^ " e|12 bg2(n) ' 



Using (5.5) and (5.19), we get:



E e \\e*-e\\ 2 < (i + D^)E e R(e,N*) + 

Since R(9,N) < R(9,N), we eventually get: 



Ce 2 



v 



| 2 log 2 (n)\ 2/3 Ce 2 1 1 



(5.19) 



n ^ 


7 2 






|0|| 2 log 


2 (n) 


n \ 


7 2 




Ce 2 t 


\ef log 


2 (n) 


n \ 


7 2 





+ 



+ 



n 


7 4/3+l 


Ce 2 


1 


n 


7 4/3+l 


Ce 2 


1 


n 


7 4/m 



+ -, 

n 



+ -■ 

n 



This concludes the proof of Theorem 2.2.



□ 



Appendix 

Lemma 5.1 For all K > 0, we have 



|fc|<7V* 



7ft 



n 2 <^^ie, E i^i- a i^i a +^. 



|fc|<./V* 



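where $C$ denotes a positive constant independent of $\epsilon$ and $n$.
PROOF. Let $Q > 0$ be a deterministic term which will be chosen later. Then

$$
\begin{aligned}
E_\theta \sum_{|k|\le N^\star} \left|\frac{\tilde\gamma_k}{\gamma_k}-1\right|^2|\theta_k|^2 &= E_\theta \sum_{|k|\le N^\star} |\theta_k|^2|\gamma_k|^{-2}|\tilde\gamma_k-\gamma_k|^2 \\
&\le Q\,E_\theta \sum_{|k|\le N^\star} |\theta_k|^2|\gamma_k|^{-2} + E_\theta \sum_{|k|\le N^\star} |\theta_k|^2|\gamma_k|^{-2}\left\{|\tilde\gamma_k-\gamma_k|^2 - Q\right\}\mathbf{1}_{\{|\tilde\gamma_k-\gamma_k|^2>Q\}}.
\end{aligned}
$$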
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Thanks to M and JM 



E \°k\ 2 hk\ 2 {l7fc-7fc| 2 -Q}a{|7 fe - 7fe | 2 >Q} 



c- 



)1 



log^n) 



E N 2]E * (fa " T*! 2 " ^) % fc -7 fe | 2 >Q}- 



|fc|<mo 



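For all $|k| \le m_0$, using an integration by parts,

$$
E_\theta\left[|\tilde\gamma_k-\gamma_k|^2 - Q\right]\mathbf{1}_{\{|\tilde\gamma_k-\gamma_k|^2>Q\}} = \int_Q^{+\infty} P\left(|\tilde\gamma_k-\gamma_k|^2 > x\right)dx.
$$

Let $x > Q$. A Bernstein type inequality provides

$$
\begin{aligned}
P\left(|\tilde\gamma_k-\gamma_k|^2 > x\right) &= P\left(\left|\frac{1}{n}\sum_{l=1}^n \left\{e^{-2i\pi k\tau_l} - E[e^{-2i\pi k\tau_l}]\right\}\right| > \sqrt{x}\right) \\
&\le 2\exp\left(-\frac{(n\sqrt{x})^2}{2\sum_{l=1}^n \mathrm{Var}(e^{-2i\pi k\tau_l}) + n\sqrt{x}/3}\right) \\
&\le 2\exp\left(-\frac{(n\sqrt{x})^2}{2n + n\sqrt{x}/3}\right).
\end{aligned}
$$

Hence, for all $|k| \le m_0$,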



/•+oo r 

E e [l7fc"7fc| 2 -Q] % fc - 7fc | 2 >Q} < _/ exp \~ 2 + Vi/3 ) d 



< C % exp (- — \ + l +0 ° exp {-Cn^i < -e~ Qn /\ 
Jq I 4 J J 36 n 



where $C$ denotes a positive constant independent of $Q$. Let $K > 0$. Choosing for instance $Q = n^{-1}K\log^2(n)$, we obtain



E 

|fe|<iV* 



Tfc _ 1 

7fc 



n 2 < k^^e, £ i7,r 2 N 2 + ^v 



n 



\k\<N* 



log (n) 



|fc|<JV* 



where $C$ denotes a positive constant independent of $\epsilon$ and $n$. This concludes the proof of Lemma 5.1.

□ 
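The Bernstein-type bound used in this proof can be explored numerically. The sketch below evaluates the right-hand side $2\exp\left(-(n\sqrt{x})^2/(2n + n\sqrt{x}/3)\right)$ and compares it with a Monte Carlo estimate of $P(|\tilde\gamma_k-\gamma_k|^2 > x)$. The Laplace law of the shifts is borrowed from Section 4; the sample sizes, the seed and the function names are illustrative assumptions.

```python
import cmath
import math
import random


def gamma(k, sigma=0.1):
    """Fourier coefficient of a Laplace shift density with scale sigma."""
    return 1.0 / (1.0 + 2.0 * sigma ** 2 * math.pi ** 2 * k ** 2)


def sample_laplace(rng, sigma=0.1):
    """Laplace variable with standard deviation sigma."""
    scale = sigma / math.sqrt(2.0)
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def bernstein_bound(n, x):
    """Right-hand side of the Bernstein-type inequality of the proof."""
    root = math.sqrt(x)
    return 2.0 * math.exp(-(n * root) ** 2 / (2.0 * n + n * root / 3.0))


def empirical_tail(k, n, x, replications, rng):
    """Monte Carlo estimate of P(|gamma_tilde_k - gamma_k|^2 > x)."""
    exceed = 0
    for _ in range(replications):
        gamma_tilde = sum(
            cmath.exp(-2j * math.pi * k * sample_laplace(rng)) for _ in range(n)
        ) / n
        if abs(gamma_tilde - gamma(k)) ** 2 > x:
            exceed += 1
    return exceed / replications


if __name__ == "__main__":
    rng = random.Random(0)
    for x in (0.01, 0.05, 0.1):
        print(x, empirical_tail(3, 100, x, 200, rng), bernstein_bound(100, x))
```

The bound decays exponentially in $nx$ for small $x$, which is the property exploited when $Q$ is chosen of order $\log^2(n)/n$.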

Lemma 5.2 Let $N^\star$ be defined in (2.8). For all deterministic bandwidth $N$ and $0 < \gamma < 1$, we have



E l^l 2 

\k\<N* 



Ik 
Ik 



1 < 2 7 



log 2 (n) 



n 



E e E l^| 2 |7fc|~ 2 + 7E 9 E \° k \ 



\k\<N* 



|fc|>7V* 



c 



+7 E i^i 2 + — E N 2 i7 fc r 2 (i-w 2 ) + 

\k\>N \k\<N ' 



where $C$ denotes a positive constant independent of $\epsilon$ and $n$.
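PROOF. First, remark that

$$
\begin{aligned}
E_\theta \sum_{|k|\le N^\star} |\theta_k|^2\left(\left|\frac{\tilde\gamma_k}{\gamma_k}\right|^2 - 1\right) &= E_\theta \sum_{|k|\le N^\star} |\theta_k|^2|\gamma_k|^{-2}\left(|\tilde\gamma_k-\gamma_k+\gamma_k|^2 - |\gamma_k|^2\right) \\
&= E_\theta \sum_{|k|\le N^\star} |\theta_k|^2|\gamma_k|^{-2}\left\{|\tilde\gamma_k-\gamma_k|^2 + 2\,\mathrm{Re}\left((\tilde\gamma_k-\gamma_k)\bar\gamma_k\right)\right\}. \qquad (5.20)
\end{aligned}
$$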



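Let $N \in \mathbb{N}$ be a deterministic bandwidth. Since $E_\theta\tilde\gamma_k = \gamma_k$ for all $k \in \mathbb{N}$, we can write that

$$
\begin{aligned}
E_\theta \sum_{|k|\le N^\star} |\theta_k|^2|\gamma_k|^{-2}\,\mathrm{Re}\left((\tilde\gamma_k-\gamma_k)\bar\gamma_k\right) &= E_\theta \sum_{|k|\in\{N\dots N^\star\}} |\theta_k|^2|\gamma_k|^{-2}\,\mathrm{Re}\left((\tilde\gamma_k-\gamma_k)\bar\gamma_k\right) \\
&\le E_\theta \sum_{|k|\in\{N\dots N^\star\}} |\theta_k|^2|\gamma_k|^{-2}\left|\mathrm{Re}\left((\tilde\gamma_k-\gamma_k)\bar\gamma_k\right)\right| \\
&\le E_\theta \sum_{k\in\mathbb{Z}} \left|\left(\mathbf{1}_{\{|k|\le N^\star\}} - \mathbf{1}_{\{|k|\le N\}}\right)|\theta_k|^2|\gamma_k|^{-2}\,\mathrm{Re}\left((\tilde\gamma_k-\gamma_k)\bar\gamma_k\right)\right|.
\end{aligned}
$$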

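Using simple algebra

$$
\begin{aligned}
\left|\mathbf{1}_{\{|k|\le N^\star\}} - \mathbf{1}_{\{|k|\le N\}}\right| &= \left|\left(\mathbf{1}_{\{|k|\le N^\star\}} + \mathbf{1}_{\{|k|\le N\}}\right)\left(\mathbf{1}_{\{|k|\le N^\star\}} - \mathbf{1}_{\{|k|\le N\}}\right)\right| \\
&= \left(\mathbf{1}_{\{|k|\le N^\star\}} + \mathbf{1}_{\{|k|\le N\}}\right)\left|\mathbf{1}_{\{|k|> N^\star\}} - \mathbf{1}_{\{|k|> N\}}\right| \\
&\le \mathbf{1}_{\{|k|> N^\star\}}\mathbf{1}_{\{|k|\le N\}} + \mathbf{1}_{\{|k|\le N^\star\}}\mathbf{1}_{\{|k|> N\}}.
\end{aligned}
$$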
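The indicator manipulation above can be verified exhaustively on a finite grid. The sketch below checks, for all frequencies $k$ and all pairs of cut-offs in a small range, that $|\mathbf{1}_{\{|k|\le N^\star\}} - \mathbf{1}_{\{|k|\le N\}}|$ is bounded by $\mathbf{1}_{\{|k|> N^\star\}}\mathbf{1}_{\{|k|\le N\}} + \mathbf{1}_{\{|k|\le N^\star\}}\mathbf{1}_{\{|k|> N\}}$. The function names and the grid size are arbitrary illustrative choices.

```python
def indicator(condition):
    """Return 1 if the condition holds and 0 otherwise."""
    return 1 if condition else 0


def lhs(k, n_star, n_det):
    """|1{|k| <= N*} - 1{|k| <= N}|."""
    return abs(indicator(abs(k) <= n_star) - indicator(abs(k) <= n_det))


def rhs(k, n_star, n_det):
    """1{|k| > N*} 1{|k| <= N} + 1{|k| <= N*} 1{|k| > N}."""
    return (indicator(abs(k) > n_star) * indicator(abs(k) <= n_det)
            + indicator(abs(k) <= n_star) * indicator(abs(k) > n_det))


def check(max_value=10):
    """Check the bound for all |k|, N*, N up to max_value."""
    return all(
        lhs(k, n_star, n_det) <= rhs(k, n_star, n_det)
        for k in range(-max_value, max_value + 1)
        for n_star in range(max_value + 1)
        for n_det in range(max_value + 1)
    )


if __name__ == "__main__":
    print(check())
```

On this grid the two sides actually coincide: a frequency contributes only when it lies between the two cut-offs.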

For all $\gamma > 0$, using the Cauchy-Schwarz and Young inequalities, we obtain

Y \Qk\ 2 \lk\- 2 Re{{lk-lk)%) 
\k\<N* 

< E e Y a {|fc|>JV*}a{|fc|<7V}|6'fe| 2 |7fcr 2 ^e((7 fc - 7fc )7fc) 



fcez 



+E e £ t{\k\<N*}^{\k\>N}\8k\ 2 \lk\ 2 Re((lk ~ lk)lk) 



k&L 



< 7 E e Y \ e k? + l Y l^| 2 + 7^% Y l^| 2 | 7 fc|~ 2 |7fc-7fc|' 

|fc|>iV+ \k\>N \k\<N 

+7"% Y l^| 2 |7fcl~ 2 |7fc-7fc| 2 - 



(5.21) 



|fc|<AT* 



Hence, from (5.20) and (5.21),



Y i^i 

|fc|<7V* 



77% 

Ik 



1 < (1 + 7 _1 )E* ^ |^| 2 |7fc|- 2 |7fc-7fc| 2 + 7lEe £ N 2 
/ \k\<N* \k\>N* 

+7 E i^i 2 +7-% Y i^i 2 i7 fc r 2 i7 fc -7A:i 2 . 

|fc|>JV \k\<N 

A direct application of Lemma 5.1 provides, for all $K > 0$,



|fc|<Ar* 



7* 



1| < (l + 7 -1 )^ 



.log 2 (n) i 



E e E |^| 2 | 7fe r 2 + 7E e Y \° k \ 

\k\<N* \k\>N* 

+7 Y i^i 2 + 2 ^ £ i^i 2 i7,r 2 (i-i7 fe i 2 )+ c 



|fc|>7V |/c|<JV 

Just set $K = \gamma^2$ in order to conclude the proof of Lemma 5.2.



nK 
□

Lemma 5.3 Let $N^\star$ be the bandwidth defined in (2.8). For all deterministic bandwidth $N$ and $0 < \gamma < 1$, we have

^=E e E \ik\- 2 Re{9 k %t k ) < 37 1 E M + ~ E W 

V |fc|<7V* Ufc|>JV I A; | < A^ 

+37log 2 (n)EJ E l^| 2 + T7 E wl + ;7 + ^ 



|fc|>7V* |/c|<iV 
PROOF. In the following, we will use the inequality: 



\k=l 



< 



Ik 



< 2 } I < exp(-log 1+r ra 



for some $r > 0$, which can be proved using a Bernstein type inequality. Then, for all $\gamma > 0$, using the above result and inequality (4.31) of [CG06], we obtain

—7=^0 E \^\~ 2 Re{e k %h) < 7 E M 2 + -^e E hk\~ 4 \%\ 2 

v \k\<N* \\k\>N \k\<N 

(jfc|>AT* |fe|<JV* J 

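In order to prove the above inequality, we use the inequality (4.31) of [CG06] and the fact that $E_\theta\tilde\gamma_k = \gamma_k$, which gives

$$
\begin{aligned}
\frac{\epsilon^2}{n}E_\theta \sum_{|k|\le N} |\gamma_k|^{-4}|\tilde\gamma_k|^2 &= \frac{\epsilon^2}{n}E_\theta \sum_{|k|\le N} |\gamma_k|^{-4}\left\{|\tilde\gamma_k|^2 - |\gamma_k|^2 + |\gamma_k|^2\right\} \\
&= \frac{\epsilon^2}{n}\sum_{|k|\le N} |\gamma_k|^{-4}\left\{|\gamma_k|^2 + \mathrm{Var}(\tilde\gamma_k)\right\} \\
&\le \frac{\epsilon^2}{n}\sum_{|k|\le N} |\gamma_k|^{-4}\left\{|\gamma_k|^2 + \frac{1}{n}\right\} \le 2\frac{\epsilon^2}{n}\sum_{|k|\le N} |\gamma_k|^{-2}.
\end{aligned}
$$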
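The step bounding $\mathrm{Var}(\tilde\gamma_k)$ by $1/n$ relies on the identity $\mathrm{Var}(\tilde\gamma_k) = (1-|\gamma_k|^2)/n$, valid for i.i.d. shifts. The sketch below verifies this identity exactly by enumerating all samples of a small discrete shift distribution. The distribution is a hypothetical toy example chosen so that the expectation is a finite sum; it is not the Laplace law of Section 4.

```python
import cmath
import itertools
import math

# Hypothetical discrete shift distribution (support points and probabilities).
SUPPORT = [0.0, 0.1, 0.35]
PROBS = [0.5, 0.3, 0.2]


def char(k, t):
    """Complex exponential exp(-2 i pi k t)."""
    return cmath.exp(-2j * math.pi * k * t)


def gamma(k):
    """gamma_k = E exp(-2 i pi k tau) for the toy distribution."""
    return sum(p * char(k, t) for t, p in zip(SUPPORT, PROBS))


def second_moment_gamma_tilde(k, n):
    """E|gamma_tilde_k|^2, computed by enumerating all n-tuples of shifts."""
    total = 0.0
    for combo in itertools.product(range(len(SUPPORT)), repeat=n):
        prob = math.prod(PROBS[i] for i in combo)
        gamma_tilde = sum(char(k, SUPPORT[i]) for i in combo) / n
        total += prob * abs(gamma_tilde) ** 2
    return total


def variance_gamma_tilde(k, n):
    """Var(gamma_tilde_k) = E|gamma_tilde_k|^2 - |gamma_k|^2."""
    return second_moment_gamma_tilde(k, n) - abs(gamma(k)) ** 2


if __name__ == "__main__":
    for n in (1, 2, 4):
        print(n, variance_gamma_tilde(2, n), (1.0 - abs(gamma(2)) ** 2) / n)
```

Since $0 \le 1 - |\gamma_k|^2 \le 1$, the identity immediately gives the bound $\mathrm{Var}(\tilde\gamma_k) \le 1/n$.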

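The same kind of inequality can be obtained with the random bandwidth $N^\star$. Indeed,

$$
\begin{aligned}
\frac{\epsilon^2}{n}E_\theta \sum_{|k|\le N^\star} |\gamma_k|^{-4}|\tilde\gamma_k|^2 &= \frac{\epsilon^2}{n}E_\theta \sum_{|k|\le N^\star} |\gamma_k|^{-4}|\tilde\gamma_k-\gamma_k+\gamma_k|^2 \\
&\le \frac{\epsilon^2}{n}E_\theta \sum_{|k|\le N^\star} |\gamma_k|^{-4}\left\{2|\tilde\gamma_k-\gamma_k|^2 + 2|\gamma_k|^2\right\} \\
&\le \frac{2\epsilon^2}{n}E_\theta \sum_{|k|\le N^\star} |\gamma_k|^{-2} + \frac{2\epsilon^2}{n}E_\theta \sum_{|k|\le N^\star} |\gamma_k|^{-4}|\tilde\gamma_k-\gamma_k|^2.
\end{aligned}
$$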
Using the same algebra as in the proof of Lemma 5.1, we obtain, for all $Q > 0$:
j 

'-^0 E It*' 



t „ — 4 1 1 ~ 1 2 

n , f— ' 

|fc|<JV* 



2 2 

= Q e -®o E M~ 4 + ^ £ | 7fc |-4{ fe _ 7fc |2_Q} %fc _ 7fc|2>Q}) 

\k\<N* |fc|<iV* 

e 2 Ce 2 n 
^ Q-®o E l7fcl ~ 4 + i^27^ Ee E l7fc-7fc| 2 a { |7 fc -7 fc P>Q}' 

\k\<N* 6 1 ; |fc|<JV* 



e 2 ^ , Ce s 



< Q-E e V |7fc|" 4 + 

r). ^ — ' 



71 *t?N* bg (n) 



Setting $Q = n^{-1}\log^2(n)$, we obtain



-e 



e2 w I —4i I— i2 / e2 m i i-2 l7fcl 2 log 2 (n) Ce 2 

|fc|<AT* |fc|<7V* 

< -e* E w- 2 + — • 

|fc|<AT* 



This concludes the proof. 



□ 